The predictions that quantum theory makes about the outcomes of measurements are generally 
probabilistic. This has raised the question whether quantum theory can be considered complete, or 
whether there could exist alternative theories that provide improved predictions. Here we review 
recent work that considers arbitrary alternative theories, constrained only by the requirement that 
they are compatible with a notion of "free choice" (defined with respect to a natural causal structure). 
It is shown that quantum theory is "maximally informative", i.e., there is no other compatible 
theory that gives improved predictions. Furthermore, any alternative maximally informative theory 
is necessarily equivalent to quantum theory. This means that the state a system has in such a 
theory is in one-to-one correspondence with its quantum-mechanical state (the wave function). In 
this sense, quantum theory is complete. 



I. INTRODUCTION 

In this article we look at the question of whether quan- 
tum theory is optimal in terms of the predictions it makes 
about measurement outcomes, or whether, instead, there 
could exist an alternative theory with improved predic- 
tive power. This was much debated in the early days 
of quantum theory, when many eminent physicists sup- 
ported the view that quantum theory will eventually be 
replaced by a deeper underlying theory. Our aim will be 
to show that no alternative theory can extend the pre- 
dictive power of quantum theory, and hence that, in this 
sense, quantum theory is complete. 

Before turning to this question, it is worth reflecting 
on why one might think that quantum theory may not 
be optimally predictive. A key factor is that the the- 
ory is probabilistic. This is in stark contrast with classi- 
cal theory, which is deterministic at a fundamental level. 
Even in classical theory there are scenarios where we may 
assign probabilities to various events, for example when 
making a weather forecast. However, this isn't in conflict 
with our belief in underlying determinism, but, instead, 
the fact that we assign probabilities simply reflects a lack 
of knowledge (about the precise value of certain physical 
quantities) when making the prediction. By analogy, we 
might imagine that even if we know the quantum state 
of a system before measurement (i.e., its wave function), 
we are also in a position of incomplete knowledge, and 
that additional knowledge might be provided in a higher 
theory. 

A further argument for incompleteness was given by 
Einstein, Podolsky and Rosen (EPR) [1]. They argued 
that whenever the outcome of an experiment can be pre- 
dicted with certainty, there should be a counterpart in 
the theory representing its value. They then consider 
measurements on a maximally entangled pair. In this 
scenario, the outcome of any measurement on one member 
of the pair can be perfectly predicted given access to 
the other member. Since the particles can be far apart, 
a measurement on one shouldn't, say EPR, affect the 
other in any way. They hence argue that there should 
be parts of the theory allowing these perfect predictions 
and, hence, that the quantum description is incomplete. 

Following EPR, one might hope that quantum theory 
can be explained in terms of an underlying determinis- 
tic theory. Such a view was put into doubt by Kochen 
and Specker [2] and by Bell [3] who showed that an un- 
derlying deterministic theory is not possible if one de- 
mands non-contextuality and freedom of choice. (A non- 
contextual theory is one in which the probability of a 
particular measurement outcome occurring depends only 
on the projector associated with that outcome, and not 
on the entire set of projectors that specify the measure- 
ment according to quantum theory.) Furthermore, in a 
second work, Bell [4] showed that an underlying theory 
cannot be compatible both with local determinism and 
with freedom of choice (we will explain this in more detail 
in Section V). It is also worth noting that the first of 
these assumptions, local determinism, can be seen as a 
physical means of justifying particular non-contextuality 
conditions. 

The results by Kochen and Specker [2] and by Bell [3, 4] 
rule out a large set of deterministic theories (those that 
are non-contextual or locally deterministic). However, 
they leave open the possibility of an alternative theory 
that enables improved predictions over those of quantum 
theory, but which may still be probabilistic. As a toy 
example, one might imagine an extension of quantum 
theory in which the quantum state is supplemented by 
an additional parameter Z. When measuring one half of 
a maximally entangled pair of qubits, it could be that if 
Z = 0 the extended theory assigns outcome 0 with probability 
3/4, and outcome 1 with probability 1/4, while, if 
Z = 1, the extended theory assigns outcome 0 with probability 
1/4, and outcome 1 with probability 3/4. The extended 
theory would thus provide more information than 
quantum theory, which predicts that both outcomes occur 
with probability 1/2. Furthermore, if Z is uniformly 
distributed, the quantum predictions are recovered when 
Z is unknown (and hence the extended theory is compatible 
with quantum theory). 
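The toy example can be checked with a few lines of arithmetic. The following Python sketch is only an illustration: the names `EXTENDED`, `P_Z`, `marginal_over_z` and `best_guess_probability` are hypothetical and do not come from the original work. It averages the extended predictions over a uniformly distributed Z and compares how well the outcome can be guessed with and without access to Z.

```python
from fractions import Fraction

# Hypothetical encoding of the toy extension: EXTENDED[z][x] = P(X = x | Z = z).
EXTENDED = {
    0: {0: Fraction(3, 4), 1: Fraction(1, 4)},
    1: {0: Fraction(1, 4), 1: Fraction(3, 4)},
}
# Assumption taken from the text: Z is uniformly distributed.
P_Z = {0: Fraction(1, 2), 1: Fraction(1, 2)}


def marginal_over_z(extended, p_z):
    """Average the extended predictions over the unknown parameter Z."""
    outcomes = sorted({x for dist in extended.values() for x in dist})
    return {x: sum(p_z[z] * extended[z][x] for z in p_z) for x in outcomes}


def best_guess_probability(dist):
    """Probability that the most likely outcome is the one observed."""
    return max(dist.values())


without_z = marginal_over_z(EXTENDED, P_Z)
print("prediction without Z:", without_z)
print("guessing probability without Z:", best_guess_probability(without_z))
print("guessing probability given Z = 0:", best_guess_probability(EXTENDED[0]))
```

Without Z the sketch reproduces the uniform quantum prediction, while knowledge of Z raises the guessing probability, which is the sense in which such an extension would be more informative.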

We note that this particular example is rather artifi- 
cial and its purpose is merely to illustrate that — in prin- 
ciple — a theory that is more informative than quantum 
theory is conceivable. However, there are historical prece- 
dents of this type, for instance related to the problem of 
determining the mass of chemical elements. Take, as an 
example, the atomic mass of chlorine. Before the discov- 
ery of isotopes, its atomic mass was thought to be 35.5, 
and the standard measurement techniques of the time 
confirmed it as such. However, it was later discovered 
that chlorine in fact naturally occurs as two isotopes with 
atomic masses 35 and 37 (in approximate ratio 3:1). By 
introducing isotopes, the theory was extended in such a 
way that the mass of an individual atom could be better 
predicted. Note that the predictions made before the dis- 
covery of isotopes were not incorrect, but are simply the 
natural ones to make without knowledge of the different 
isotopes (and hence the new theory is compatible with 
the old one). 

Returning to quantum theory, various alternatives, 
motivated more physically than our earlier toy example, 
have been proposed in the past, some of which we will review 
later (see Section V). Similarly to quantum theory, 
these alternatives provide rules to compute predictions 
for future measurement outcomes, based on certain (ad- 
ditional) parameters. 

The aim of this article is to explain recent results relating 
the predictive power of quantum theory to that of 
possible alternative theories [5, 6]. For this, we first need 
to specify what we mean by "quantum theory" and by 
"alternative theories", and how they can be compared 
(Section III). The central requirement we impose on any 
alternative theory is that it be compatible with a notion 
of "free choice". This means that the theory can be 
applied consistently to settings where measurements are 
chosen independently of pre-existing events (Section IV). 
We then revisit some standard results, in particular by 
Bell, which impose constraints on any alternative theory 
that is compatible with quantum theory; for instance, 
that no such theory can be locally deterministic 
(Section V). The last sections are then devoted to the 
recent, more general, results. A central claim is that no 
alternative theory that is compatible with quantum theory 
can improve the predictions of quantum theory 
(Sections VI and VII). Furthermore, if such an alternative 
theory is also at least as informative as quantum theory, 
then it is necessarily equivalent to quantum theory 
(Section VIII). In this sense, quantum theory is complete. 

We conclude with a discussion of how these results relate 
to known hidden-variable theories, in particular the 
de Broglie-Bohm theory, and mention some applications 
(Section IX). 



A. Notation 



On a technical level, the main results presented in this 
article are theorems about random variables (RVs) whose 
(joint) probability distribution satisfies certain assump- 
tions. We will only use RVs with discrete range. In the 
following we introduce our notation for such RVs and 
their distributions. 

We usually use upper case letters to denote RVs, while 
lower case letters specify particular values they can take. 
Thus, $X = x$ means that the RV $X$ takes the value $x$. 
We write $P_X$ to denote the probability distribution of 
the RV $X$, with $P_X(x)$ being the probability that $X = x$. 
For two RVs, $X$ and $Y$, $P_{XY}$ represents their joint 
distribution. We also use $P_{X|Y} := P_{XY}/P_Y$ to represent 
the conditional distribution of $X$ given $Y$. This is defined 
for all $y$ such that $P_Y(y) > 0$. For any such $y$, we write 
$P_{X|Y=y} := P_{X|Y}(\cdot|y)$ to denote the distribution of the 
RV $X$ conditioned on $Y = y$. We often abbreviate this 
distribution to $P_{X|y}$. We also use $P(X = Y)$ to denote 
the probability that the RVs $X$ and $Y$ have equal values, 
i.e., $P(X = Y) := \sum_x P_{XY}(x, x)$ and, likewise, 
$P(X \neq Y) := 1 - P(X = Y)$. 



B. Distance between probability distributions 

Our technical argument uses the variational distance 
to quantify the closeness of two probability distributions. 
For two distributions, $P_X$ and $Q_X$, it is defined by 

$$D(P_X, Q_X) := \frac{1}{2} \sum_x |P_X(x) - Q_X(x)| .$$

This measure is connected to the distinguishability of the 
two distributions. Specifically, suppose we have a black 
box that samples either from $P_X$ or $Q_X$. Then, given one 
sample, the maximum probability of successfully guessing 
whether the sample has been generated from $P_X$ or 
$Q_X$ equals $\frac{1}{2}(1 + D(P_X, Q_X))$. Thus, if two distributions 
are close in variational distance, they are virtually 
indistinguishable. Appendix A summarizes some properties 
of $D(\cdot,\cdot)$ that are used in this work. 
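To make the operational meaning of the variational distance concrete, here is a minimal Python sketch. It assumes distributions are stored as dictionaries from values to probabilities and that the black box picks either distribution with equal prior probability; the function names are illustrative only. It computes the distance and, separately, the success probability of the maximum-likelihood guess, so that the two can be compared.

```python
def variational_distance(p, q):
    """D(P, Q) = 1/2 * sum_x |P(x) - Q(x)| for distributions given as dicts."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)


def optimal_guessing_probability(p, q):
    """Success probability of the maximum-likelihood guess, equal priors assumed.

    On seeing x, guess whichever distribution assigns x the larger probability.
    """
    support = set(p) | set(q)
    return sum(0.5 * max(p.get(x, 0.0), q.get(x, 0.0)) for x in support)


# Example distributions (chosen arbitrarily for illustration).
P = {0: 0.75, 1: 0.25}
Q = {0: 0.5, 1: 0.5}

d = variational_distance(P, Q)
guess = optimal_guessing_probability(P, Q)
print(d, guess, 0.5 * (1 + d))
```

For these example distributions the directly computed guessing probability coincides with $\frac{1}{2}(1 + D)$, in line with the statement in the text.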



C. Measuring correlations 

A useful approach towards characterizing alternative 
theories is to consider the correlations (between the out- 
comes of two distant measurements) that can be repro- 
duced by a given theory. The strength of these correla- 
tions may then, for instance, be compared to those oc- 
curring in quantum theory. To quantify correlations, we 
use a measure that has been proposed by Pearle [7] and, 
independently, by Braunstein and Caves [8], based on 
earlier work by Clauser, Horne, Shimony, and Holt [9]. 
The correlation measure is tailored to a specific bipartite 
setup where measurements are carried out at two 
separate locations. One of the measurements is specified 
by a parameter $A$ and has outcome $X$. The other 
is specified by a parameter $B$ and has outcome $Y$. It is 
furthermore assumed that the outcomes $X$ and $Y$ take 
values from the binary set $\{0, 1\}$ and that the parameters 
$A$ and $B$ are labelled by elements from the sets 
$\{0, 2, \ldots, 2N-2\} =: \mathcal{A}_N$ and $\{1, 3, \ldots, 2N-1\} =: \mathcal{B}_N$, 
respectively, where $N$ is an integer. The correlation measure, 
in the following denoted by $I_N$, is then defined by 

$$I_N(P_{XY|AB}) := P(X = Y | A = 0, B = 2N-1) + \sum_{a, b :\, |a-b|=1} P(X \neq Y | A = a, B = b) .$$

Note that the measure only depends on the conditional 
distribution $P_{XY|AB}$. 

We will be particularly interested in the correlations 
that quantum theory predicts for measurements on two 
maximally entangled two-level systems. To specify these 
correlations, let 

$$|\psi_0\rangle := \frac{1}{\sqrt{2}} \left( |\uparrow\uparrow\rangle + |\downarrow\downarrow\rangle \right) ,$$

where $\{|\uparrow\rangle, |\downarrow\rangle\}$ is an orthonormal basis. Furthermore, 
define $|\theta\rangle := \cos\frac{\theta}{2} |\uparrow\rangle + \sin\frac{\theta}{2} |\downarrow\rangle$, and take $E_x^a$ to be the 
projector onto $|(\frac{a}{2N} + x)\pi\rangle$ and, likewise, $F_y^b$ to be the 
projector onto $|(\frac{b}{2N} + y)\pi\rangle$, as shown in Figure 1. We 
then define $P_{XY|AB\psi_0}$ as the conditional distribution of 
the outcomes of two separate quantum measurements, 
specified by $\{E_x^a\}_x$ and $\{F_y^b\}_y$, respectively, applied to 
two separate subsystems with joint state $|\psi_0\rangle$, i.e., 

$$P_{XY|ab\psi_0}(x, y) := \langle\psi_0| E_x^a \otimes F_y^b |\psi_0\rangle .$$

It is easy to verify that the correlation strength, quantified 
with the above correlation measure, $I_N$, equals 

$$I_N(P_{XY|AB\psi_0}) = 2N \sin^2\frac{\pi}{4N} . \tag{1}$$
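Eq. (1) can be verified numerically. The sketch below is a hedged illustration in plain Python: it assumes the measurement directions are the real states $|\theta\rangle = \cos\frac{\theta}{2}|\uparrow\rangle + \sin\frac{\theta}{2}|\downarrow\rangle$ with $\theta = (\frac{a}{2N} + x)\pi$ and $\theta = (\frac{b}{2N} + y)\pi$, and the function names are hypothetical. It evaluates the joint outcome probabilities on the maximally entangled state, sums them as in the definition of $I_N$, and compares with $2N\sin^2\frac{\pi}{4N}$.

```python
import math


def ket(theta):
    """Real amplitudes (up, down) of cos(theta/2)|up> + sin(theta/2)|down>."""
    return (math.cos(theta / 2), math.sin(theta / 2))


def joint_probability(a, b, x, y, n):
    """P(x, y | a, b) on the state (|up,up> + |down,down>)/sqrt(2)."""
    e = ket((a / (2 * n) + x) * math.pi)
    f = ket((b / (2 * n) + y) * math.pi)
    # Overlap of the product state |e>|f> with the entangled state.
    amplitude = (e[0] * f[0] + e[1] * f[1]) / math.sqrt(2)
    return amplitude ** 2


def correlation_measure(n):
    """I_N evaluated on the quantum correlations defined above."""
    # First term: probability of equal outcomes for a = 0, b = 2N - 1.
    total = sum(joint_probability(0, 2 * n - 1, x, x, n) for x in (0, 1))
    # Second term: probability of unequal outcomes for neighbouring settings.
    for a in range(0, 2 * n, 2):
        for b in range(1, 2 * n, 2):
            if abs(a - b) == 1:
                total += sum(joint_probability(a, b, x, 1 - x, n) for x in (0, 1))
    return total


for n in (1, 2, 3, 10):
    print(n, correlation_measure(n), 2 * n * math.sin(math.pi / (4 * n)) ** 2)
```

The two printed columns agree, and the value decreases towards zero as $N$ grows, which is the feature exploited when comparing with alternative theories.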



III. QUANTUM AND ALTERNATIVE 
THEORIES 

The aim of this article is to make statements about 
physical theories, i.e., quantum theory as well as possi- 
ble alternatives to it. However, in order to derive our 
result, we do not need to provide a comprehensive math- 
ematical definition for the concept of a "physical theory" . 
Rather, it suffices to focus on one crucial feature that we 
expect any theory to have, namely that it allows us to 
compute predictions about values that can be observed 
(e.g., in an experiment). These predictions, which need 
not be deterministic, are generally based on certain parameters 
that characterize the (experimental) setup, i.e., 
how it has been prepared (its initial state), the evolution 
it undergoes, and which measurements are going to be 
applied. 

FIG. 1: Depiction of the measurements used for the definition 
of the correlation measure $I_N$. The circle represents the 
$\{|\uparrow\rangle, \frac{1}{\sqrt{2}}(|\uparrow\rangle + |\downarrow\rangle)\}$ plane of the Bloch sphere. The arrows depict 
the Bloch vectors associated with the 0 outcome (i.e., $E_0^a$ or $F_0^b$ 
are the projectors onto these states). Those for the 1 outcome 
lie in the opposite direction and are not depicted. The correlation 
measure $I_N$ depends on the probability of obtaining identical 
outputs when measuring two subsystems in neighbouring bases. 

A. Predictions of quantum theory 

In quantum theory, given the state, $\Psi$, of a system as 
well as a specification of the measurement process, $A$, a 
prediction about an experimentally observable value, $X$, 
can be obtained from Born's rule. The state $\Psi$ may be 
given in the form of a density operator on a Hilbert space 
$\mathcal{H}$ and any measurement process $A = a$ can be characterized 
by a Positive Operator Valued Measure (POVM) 
on $\mathcal{H}$, i.e., a family of positive operators $\{E_x^a\}_x$ labelled 
by the possible measurement outcomes $x \in \mathcal{X}$ such that 
$\sum_x E_x^a = \mathbb{1}_{\mathcal{H}}$. (In this work, we assume for simplicity 
that the set $\mathcal{X}$ is finite.) 

For our treatment, we will assume that any evolution 
of the system prior to the measurement $\{E_x^a\}_x$ is already 
accounted for by its quantum state, i.e., that $\Psi = \psi$ is 
the state of the system directly before the measurement 
is applied.$^{1}$ The predictions that quantum theory makes 
about the measurement outcome $X$ can then be represented 
as a conditional distribution $P_{X|A\Psi}$, which is given 
by 

$$P_{X|a\psi}(x) = \mathrm{tr}(E_x^a \psi) \quad \forall x \in \mathcal{X} . \tag{2}$$



1 Alternatively, one may work in the Heisenberg picture, for 
instance, and use the POVM to account for the evolution. 
We note that, by considering an extension of the Hilbert 
space $\mathcal{H}$, we may describe any quantum-mechanical measurement 
process equivalently as a projective measurement, 
i.e., one for which the POVM $\{E_x^a\}_x$ consists of 
orthogonal projectors.$^{2}$ Furthermore, we call a set of 
POVMs $\{E_x^a\}$ on $\mathcal{H}$ tomographically complete if the values 
$P_{X|a\psi}(x)$ for all $a$ and $x$ are sufficient to determine 
$\psi$ on $\mathcal{H}$ uniquely.$^{3}$ 
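As an illustration of Born's rule and of tomographic completeness, the following Python sketch treats a single qubit. It is a minimal example under stated assumptions: density operators and projectors are stored as 2x2 nested lists of complex numbers, the three measurements are projective measurements along the x, y and z axes of the Bloch sphere, and all function names are hypothetical. It computes the outcome probabilities $\mathrm{tr}(E\psi)$ and then reconstructs the state from them.

```python
import math


def projector(amp_up, amp_down):
    """Projector |v><v| for a normalised qubit vector v = (amp_up, amp_down)."""
    v = (complex(amp_up), complex(amp_down))
    return [[v[i] * v[j].conjugate() for j in range(2)] for i in range(2)]


def born_probability(effect, rho):
    """Born's rule: tr(E rho) for 2x2 matrices stored as nested lists."""
    trace = sum(effect[i][j] * rho[j][i] for i in range(2) for j in range(2))
    return complex(trace).real


S = 1 / math.sqrt(2)
# Projectors for outcome 0 of the three measurements (x, y and z axes).
E_X = projector(S, S)
E_Y = projector(S, 1j * S)
E_Z = projector(1, 0)


def reconstruct(p_x, p_y, p_z):
    """Rebuild rho = (1 + r . sigma)/2 from the three outcome-0 probabilities."""
    # Each Bloch component equals P(outcome 0) - P(outcome 1) = 2 P(outcome 0) - 1.
    rx, ry, rz = 2 * p_x - 1, 2 * p_y - 1, 2 * p_z - 1
    return [[(1 + rz) / 2, (rx - 1j * ry) / 2],
            [(rx + 1j * ry) / 2, (1 - rz) / 2]]


# Example state: the pure state with Bloch angle theta = pi/3 (arbitrary choice).
theta = math.pi / 3
rho = projector(math.cos(theta / 2), math.sin(theta / 2))
probabilities = [born_probability(E, rho) for E in (E_X, E_Y, E_Z)]
rho_reconstructed = reconstruct(*probabilities)
print(probabilities)
print(rho_reconstructed)
```

Because the three probabilities fix the Bloch vector, the reconstructed matrix agrees with the original state, which is what tomographic completeness means for this set of measurements.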

For later reference, we also note that, according to 
quantum theory, any possible evolution of a quantum 
system, $S$, corresponds to a unitary mapping on a larger 
state space (that may include the environment of the system). 
In the case of a measurement process, this larger 
state space includes the measurement device, $D$. Specifically, 
a projective measurement, say $\{E_x^a\}_x$, would correspond 
to a unitary of the form 

$$|\phi\rangle_S \mapsto \sum_x E_x^a |\phi\rangle_S \otimes |x\rangle_D ,$$

where $\{|x\rangle_D\}_x$ are orthonormal states of the measurement 
device (and possibly also its environment) that encode 
the outcome. The outcome $X$ of the original measurement 
may then be recovered by a subsequent projective 
measurement on $D$ in the basis $\{|x\rangle_D\}_x$. 



B. Predictions of alternative theories 

In an alternative theory, the measurement process A 
with outcome X, as described above in terms of the quan- 
tum formalism, may admit a different description. This 
description could involve other parameters, which we denote 
by $Z$ (one might think of $Z$ as the list of all parameters 
used by the theory to describe the system's state 
before the measurement $A$ is chosen).$^{4}$ For any values 
$A = a$ and $Z = z$ of these parameters, the theory specifies 
a rule for computing the probability distribution, 
$P_{X|az}$, for the measurement outcome $X$. Hence, in the 
following, if we want to make a statement about the predictive 
power of a given theory,$^{5}$ it is sufficient to consider 
the properties of the corresponding distributions $P_{X|AZ}$. 



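2 According to Naimark's theorem, there exists a Hilbert space $\mathcal{H}'$ 
that contains $\mathcal{H}$ as a subspace as well as orthogonal projectors 
$P_x^a$ in $\mathcal{H}'$ such that for each $x \in \mathcal{X}$ the POVM element $E_x^a$ is the 
projection of $P_x^a$ into $\mathcal{H}$. 

3 An example of a tomographically complete set of projective 
POVMs in the case of a single qubit are the three POVMs whose 
elements are projectors onto (i) $|\uparrow\rangle$ and $|\downarrow\rangle$, (ii) $(|\uparrow\rangle + |\downarrow\rangle)/\sqrt{2}$ 
and $(|\uparrow\rangle - |\downarrow\rangle)/\sqrt{2}$, and (iii) $(|\uparrow\rangle + i|\downarrow\rangle)/\sqrt{2}$ and $(|\uparrow\rangle - i|\downarrow\rangle)/\sqrt{2}$. 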

4 In [5], Z was modelled more generally as a system with input and 
output. For simplicity, we ignore this higher level of generality 
in this work. 

5 When referring to the predictive power of a theory, we mean 
predictions based on the value Z. 



C. Compatibility of predictions 

The predictions computed within two different theo- 
ries (e.g., quantum theory and an alternative theory) are 
generally not identical. Nevertheless, they may be compatible 
with each other, in the following sense. Let $Z$ 
and $Z'$ be the parameters of two different theories, and 
let their predictions (about the outcome $X$ of a measurement 
$A$) be given by conditional probability distributions 
$P_{X|AZ}$ and $P_{X|AZ'}$, respectively.$^{6}$ 

Definition 1. $P_{X|AZ}$ and $P_{X|AZ'}$ are said to be compatible 
if there exists a conditional distribution $P_{XZZ'|A}$ 
such that$^{7}$ 

$$P_{X|az} = \sum_{z'} P_{XZ'|az}(\cdot, z') \quad \forall a, z$$

$$P_{X|az'} = \sum_{z} P_{XZ|az'}(\cdot, z) \quad \forall a, z' ,$$

where the conditional distributions in the sums are derived 
from $P_{XZZ'|A}$.$^{8}$ 

To relate the definition back to the earlier example of 
the isotopes, by way of illustration, the chemical element 
could be specified by Z, and the particular isotope by 
Z' . The relevant predictions are then compatible in the 
above sense: since Z' is a fine-graining of Z (i.e., Z is 
uniquely determined by Z'), the second relation is trivial, 
while the first recovers the non-isotopic predictions by 
averaging over the different isotopes. 
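The isotope illustration can be spelled out as a small computation. The Python sketch below is a simplified, hypothetical encoding: there is a single trivial measurement setting (so $A$ is dropped), $X$ is the mass number of an individual chlorine atom, $Z$ names the element and $Z'$ names the isotope, and the 3:1 ratio from the introduction is used as the joint distribution. It checks that both theories' predictions are recovered from one joint distribution, and that the isotope-level parameter, but not the element-level one, leaves no information to be gained from the other parameter.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical joint distribution P_{XZZ'}(x, z, z') with keys (x, z, z').
JOINT = {
    (35, "Cl", "Cl-35"): Fraction(3, 4),
    (37, "Cl", "Cl-37"): Fraction(1, 4),
}

# Predictions of the two theories, P_{X|z} and P_{X|z'} (illustrative values).
ELEMENT_THEORY = {"Cl": {35: Fraction(3, 4), 37: Fraction(1, 4)}}
ISOTOPE_THEORY = {"Cl-35": {35: Fraction(1)}, "Cl-37": {37: Fraction(1)}}


def conditional_x(joint, given):
    """Distribution of X conditioned on given(key), derived from the joint."""
    weights = defaultdict(lambda: defaultdict(Fraction))
    for key, p in joint.items():
        weights[given(key)][key[0]] += p
    return {c: {x: p / sum(d.values()) for x, p in d.items()}
            for c, d in weights.items()}


def is_compatible(joint, theory_z, theory_zp):
    """Both predictions must be recovered by marginalising the joint."""
    return (conditional_x(joint, lambda k: k[1]) == theory_z
            and conditional_x(joint, lambda k: k[2]) == theory_zp)


def at_least_as_informative(joint, index):
    """index = 1 tests Z, index = 2 tests Z': does the other parameter add nothing?"""
    single = conditional_x(joint, lambda k: k[index])
    both = conditional_x(joint, lambda k: (k[1], k[2]))
    return all(single[pair[index - 1]] == dist for pair, dist in both.items())


mean_mass = sum(x * p for x, p in ELEMENT_THEORY["Cl"].items())
print(is_compatible(JOINT, ELEMENT_THEORY, ISOTOPE_THEORY))
print(at_least_as_informative(JOINT, 2), at_least_as_informative(JOINT, 1))
print(mean_mass)
```

In this toy encoding the averaged mass reproduces the pre-isotope value of 35.5, so the coarser theory is not wrong, only less informative.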

We will use this notion of compatibility to compare 
quantum theory to alternative theories. For this, we let 
$Z' = \Psi$ be the quantum state of a system and consider 
the conditional distribution $P_{X|A\Psi}$ defined by (2). An 
alternative theory with predictions specified by $P_{X|AZ}$ 
(based on a parameter $Z$) can then be considered compatible 
with quantum theory if there exists a distribution 
$P_{XZ\Psi|A}$ such that both $P_{X|A\Psi}$ and $P_{X|AZ}$ can be recovered 
from it (in the sense of the above definition). 



6 Note that the conditional probability distribution $P_{X|AZ}$ (and, 
similarly, $P_{X|AZ'}$) may be defined only for a restricted set of 
pairs $(a, z)$. 

7 We require that both sides of the equalities are defined for the 
same pairs $(a, z)$ and $(a, z')$. 

8 That is, $P_{XZ'|az}$ is given by 

$$P_{XZ'|az}(x, z') = P_{XZZ'|a}(x, z, z') / P_{Z|a}(z) \quad (\text{if } P_{Z|a}(z) > 0) ,$$

where $P_{Z|a}(z) = \sum_{x, z'} P_{XZZ'|a}(x, z, z')$, and likewise for 
$P_{XZ|az'}$. 



D. Comparing the accuracy of predictions 

The predictive powers of different theories can be compared 
provided the theories are mutually compatible. 
The idea is that a theory with predictions $P_{X|AZ}$ is at 
least as informative as another theory with predictions 
$P_{X|AZ'}$ if the latter can be obtained from the former, i.e., 
if the parameter $Z'$ does not provide any information beyond 
$Z$. This motivates the following definition. 

Definition 2. Let $P_{X|AZ}$ and $P_{X|AZ'}$ be compatible. 
$P_{X|AZ}$ is said to be (at least) as informative as $P_{X|AZ'}$ 
if there exists a conditional distribution $P_{XZZ'|A}$ as in 
Definition 1 such that 

$$P_{X|az} = P_{X|azz'} \quad \forall a, z, z' \text{ s.t. } P_{ZZ'|a}(z, z') > 0 ,$$

where $P_{X|AZZ'}$ and $P_{ZZ'|A}$ are the conditional distributions 
derived from $P_{XZZ'|A}$. 

This can again be illustrated using the earlier example 
of the isotopes. The theory that includes the information 
Z' about the particular isotope is of course at least as 
informative as the one that only specifies the chemical 
element Z, but Z is not as informative as Z' . 

We remark that quantum-mechanical predictions 
based on pure states are generally more informative than 
those derived from mixed states. To see this, imagine a 
system that is prepared in a pure state $\psi_C$ depending on 
a random bit $C$, and assume that a measurement with 
outcome $X$ is performed. If $C$ is unknown, with $C = 0$ 
and $C = 1$ being equally likely, the distribution of $X$ is, 
according to quantum theory, given by (2) with $\psi$ substituted 
by the mixed state $\frac{1}{2}\psi_0 + \frac{1}{2}\psi_1$. However, if we 
had access to $C$, we could use (2) with $\psi$ replaced by $\psi_C$, 
resulting in a more accurate prediction. 

Clearly, when studying the question of whether there 
can be more informative theories than quantum theory, 
we need to consider specifications of states and measure- 
ment processes that are maximally informative among 
all predictions that are possible within quantum theory. 
Hence, following the above remark, we will restrict our 
attention to quantum states that correspond to pure den- 
sity operators and to projective measurements. 

IV. FREEDOM OF CHOICE 

As explained above, physical theories involve certain 
parameters, and it is generally assumed (often implicitly) 
that these can be chosen freely. Quantum mechanics, for 
instance, allows us to compute the probabilities of a measurement 
outcome $X$ depending on the system's state $\Psi$ 
as well as a description of the measurement process, A, 
and our understanding is that these parameters can in 
principle be chosen freely (e.g., by an experimenter car- 
rying out a measurement of her choice). In fact, one 
may argue that a description of nature that does not in- 
volve any such choices — thereby not allowing us to com- 
pute conclusions for different initial conditions — cannot 
be reasonably termed a theory [10] . 

It is worth noting that by assuming free choice, we 
are not making any metaphysical assertion that the real 
world contains, say, agents with free will, or anything 
of that sort. Instead, allowing free choice is a property 
that we require of a theory. In essence, it means that the 
theory gives predictions for all possible values of the free 
parameters, and furthermore, that it does so no matter 
what happened elsewhere in the theory. Without such 
an assumption, depending on other events described by 
the theory, certain values of the 'free' parameters could 
be unavailable, in the sense that the theory would not be 
able to predict a response to them. 

In this section, we specify what we mean by such free 
choices. The idea is that, for a given theory, a parameter 
of the theory, say A, is considered free if it is possible 
to choose A such that it is uncorrelated with all other 
values (described by the theory) except those that lie 
in the causal future of A. However, for this definition to 
make sense mathematically, we need to establish a notion 
of causal future, which we do next. 

A. Causal order 

Let $\Gamma$ be the set of all parameters required for the description 
of an experiment within a given theory. In particular, 
$\Gamma$ may contain variables that specify the (joint) 
state in which the relevant physical systems have been 
prepared (in the following usually denoted by $\Psi$ for quantum 
theory and by $Z$ for more general theories), the 
choice of measurements (denoted $A$ and $B$), as well as 
the measurement outcomes (denoted $X$ and $Y$). For any 
such set of variables $\Gamma$, we can define a causal order $\rightsquigarrow$ 
as follows. 

Definition 3. A causal order $\rightsquigarrow$ for $\Gamma$ is a preorder 
relation$^{9}$ on $\Gamma$. If $A \rightsquigarrow X$, we say that $X$ is in the causal 
future of $A$. 

Note that the relation $\rightsquigarrow$ can be conveniently specified 
by a diagram (see Figure 2 for two simple examples). 

To understand the physical relevance of the statements 
in this work, it is useful to interpret the relation $A \rightsquigarrow X$ 
as "$A$ can be the cause of $X$." We stress that this is not 
meant to imply that there is an actual physical process 
such that changing $A$ imposes a change of $X$, but rather 
that the existence of such a process is not precluded (by 
the theory). Conversely, if $A \rightsquigarrow X$ does not hold, we also 
write $A \not\rightsquigarrow X$ and interpret this as "$A$ cannot be the cause 
of $X$." 

A typical (but for the following considerations not 
necessary) requirement on a causal order is that it be 
compatible with relativistic space time. Consider, for 
example, an experiment where a parameter $A$ is chosen at 
a given space time point $r_A$ and where a measurement 
outcome $X$ is observed at another space time point $r_X$. 
One would then naturally demand that $A \rightsquigarrow X$ if and 
only if $r_X$ lies in the future light cone of $r_A$. This captures 
the idea that $A$ can only be the cause of $X$ if $A$ is 
chosen before the observation $X$ is made (with respect 
to any reference frame). 



9 That is, $\rightsquigarrow$ is a binary relation on the set $\Gamma$ that is reflexive (i.e., 
$A \rightsquigarrow A$) and transitive (i.e., $Z \rightsquigarrow A$ and $A \rightsquigarrow X$ imply $Z \rightsquigarrow X$). 



FIG. 2: Free choice and causal order. (a) An arbitrary causal 
order. The arrows correspond to the relation $\rightsquigarrow$. For example, 
$G$ lies in the causal future of $F$, i.e., $F \rightsquigarrow G$, but not of $J$, 
i.e., $J \not\rightsquigarrow G$. Because of the transitivity property, the arrow 
from $F$ to $E$ is redundant. In this setting we would say that, 
for instance, $G$ is free if it is uncorrelated with $F$ and $J$, i.e., 
$P_{GFJ} = P_G \times P_{FJ}$. (b) The conventional causal order used for 
our argument. This causal order is natural because $A \rightsquigarrow X$ and 
$B \rightsquigarrow Y$ are interpreted as spacelike separated measurements, 
and $Z$ as some arbitrary additional information available before 
the measurement. Note, however, that the position of $Z$ need 
not be as shown; it is sufficient that $Z$ is not in the causal future 
of $A$ or $B$. 



B. Free random variables 

To define the notion of a "free choice", we consider a 
set $\Gamma$ of RVs equipped with a causal order. (As above, 
$\Gamma$ should be thought of as the set of all parameters relevant 
for the description of an experiment within a given 
theory.) 

Definition 4. We say that $A \in \Gamma$ is free if 

$$P_{A\Gamma_A} = P_A \times P_{\Gamma_A}$$

holds, where $\Gamma_A$ is the set of all RVs $X \in \Gamma$ such that 
$A \not\rightsquigarrow X$.$^{10}$ 

Obviously, whether a variable from the set $\Gamma$ is considered 
free depends on the causal order that we impose. 
We remark that, if the causal order is taken to be the 
one induced by relativistic space time (see the description 
above), then this definition coincides with the notion 
of a free variable as used by Bell [10].$^{11}$ We also remark 
that both standard quantum theory and classical theory 
allow for free choices within such a causal order. 



10 By definition, the set $\Gamma_A$ also excludes $A$. 

11 In [10], Bell discusses the assumption that the settings of instruments 
are free variables, which he characterizes as follows: "For 
me this means that the values of such variables have implications 
only in their future light cones." 



V. CONSTRAINTS ON THEORIES 
COMPATIBLE WITH QUANTUM THEORY 

The debate about whether quantum theory could be 
replaced by a higher (possibly deterministic) theory has a 
long history (see also the introductory section). The common 
feature of all proposed higher theories is that they 
would make more informative predictions than quantum 
theory. Here, we review some well-known results that impose 
constraints on such higher theories. We note that 
these constraints can be seen as special cases of the general 
theorem presented in Section VI which excludes all 
alternative theories whose predictions are more informative 
than quantum theory. 



A. Bipartite setup 

The statements described below refer to a bipartite 
setup which involves two separate measurements, specified 
by parameters $A$ and $B$, and with outcomes $X$ and 
$Y$, respectively. As before, we consider a theory that 
allows us to compute predictions about these measurements, 
based on a parameter (or list of parameters) $Z$. 
Furthermore, in order to define free choices, we need to 
specify a causal order. For concreteness, we take the 
causal order defined by Figure 2(b). We note, however, 
that the technical claims described in this section can be 
generalized to any causal order that satisfies the following 
conditions: 

(i) $A \rightsquigarrow X$ and $B \rightsquigarrow Y$; 

(ii) $A \not\rightsquigarrow Z$ and $B \not\rightsquigarrow Z$; 

(iii) $A \not\rightsquigarrow Y$ and $B \not\rightsquigarrow X$. 

Condition (i) corresponds to the requirement that the 
measurement is specified before its outcome is obtained. 
Condition (ii) captures the fact that the parameters of 
the theory, $Z$, on which the predictions are based, should 
not only become available after the measurement process 
is started. This assumption can be considered necessary 
in order to reasonably talk about "predictions". Finally, 
Condition (iii) demands that the arrangement of the two 
measurements should be such that neither of them lies 
in the causal future of the other. (Note that, assuming a 
relativistic space time structure, this would correspond to 
a setup where the measurements are spacelike separated.) 
Together, the three conditions imply a causal order in 
which $A$ is considered free if $P_{ABYZ} = P_A \times P_{BYZ}$, and 
likewise for $B$. 
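The free choice condition for the bipartite causal order, namely that $A$ is free if $P_{ABYZ} = P_A \times P_{BYZ}$, can be tested mechanically on any explicit joint distribution. The following Python sketch is an illustration under stated assumptions: the joint distribution is a dictionary over tuples $(a, b, y, z)$ from which variables in the causal future of $A$ (here $X$) have already been marginalised out, and the two example distributions and all function names are hypothetical.

```python
from collections import defaultdict
from itertools import product


def marginal(joint, keep):
    """Marginal distribution over the tuple positions listed in `keep`."""
    out = defaultdict(float)
    for values, p in joint.items():
        out[tuple(values[i] for i in keep)] += p
    return dict(out)


def is_free(joint, index, tol=1e-12):
    """Check that the joint factorises as P_var x P_rest for position `index`.

    The caller is responsible for excluding variables that lie in the causal
    future of the tested variable before calling this function.
    """
    arity = len(next(iter(joint)))
    rest = [i for i in range(arity) if i != index]
    p_var = marginal(joint, [index])
    p_rest = marginal(joint, rest)
    for (v,), r in product(p_var, p_rest):
        full = r[:index] + (v,) + r[index:]
        if abs(joint.get(full, 0.0) - p_var[(v,)] * p_rest[r]) > tol:
            return False
    return True


# Example 1: A is uniform and independent of (B, Y, Z); Y = B xor Z.
free_example = {(a, b, b ^ z, z): 1 / 8 for a, b, z in product((0, 1), repeat=3)}
# Example 2: the setting A always equals the parameter Z, so A is not free.
correlated_example = {(z, b, b ^ z, z): 1 / 4 for b, z in product((0, 1), repeat=2)}

print(is_free(free_example, 0))
print(is_free(correlated_example, 0))
```

The second example shows the kind of correlation between a measurement setting and pre-existing parameters that the free choice assumption rules out.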
B. Local deterministic theories 

Local deterministic theories were introduced in the 
work of Bell [4]. Determinism means that the outcomes 
of measurements can be predicted with certainty, given 
access to the parameters of the theory, $Z$ (sometimes 
termed "hidden variables"). Locality (or local causality) 
refers to the additional requirement that the predictions 
only depend on "local" parameters. Within the bipartite 
setup described above, this means that the measurement 
outcome $X$ only depends on the choice of measurement 
$A$ as well as $Z$, and, similarly, $Y$ only depends on $B$ and 
$Z$. Determinism and local causality together imply that 
all conditional probabilities $P_{X|az}(x)$ and $P_{Y|bz}(y)$ must 
be equal to either 0 or 1. This property is also called 
local determinism. 

Bell's theorem now asserts that no locally deterministic 
theory can reproduce the predictions of quantum 
theory. To show this, it is sufficient to consider the 
correlations $P_{XY|AB\psi_0}$ that quantum theory predicts for 
the measurements on the maximally entangled state $|\psi_0\rangle$ 
defined in Section II C. 

Theorem 1 (Bell's Theorem). Let $A$, $B$, $X$, $Y$ and $Z$ 
be RVs. Then at least one of the following cannot hold: 

• Freedom of choice:$^{12}$ $A$ and $B$ are free with respect 
to the causal order defined by Figure 2(b); 

• Compatibility with quantum theory: $P_{XY|ABZ}$ is 
compatible with $P_{XY|AB\Psi}$ for $\Psi = \psi_0$; 

• Local determinism: 

$$P_{X|az}(x) \in \{0, 1\} \quad \forall a, z \text{ s.t. } P_{AZ}(a, z) > 0$$

$$P_{Y|bz}(y) \in \{0, 1\} \quad \forall b, z \text{ s.t. } P_{BZ}(b, z) > 0 .$$

To prove this theorem, we use the correlation measure $I_N$ defined in Section II C. The central idea is to show that, under the free choice assumption, all correlations explained by a locally deterministic model satisfy the inequality $I_2 \geq 1$, which corresponds to the CHSH inequality [S]. (The free choice assumption ensures that $P_{AB|z}$ has full support for each z, and hence that the conditional distributions $P_{X|az}$ and $P_{Y|bz}$ are well defined for any a, b, and z.) The assertion then follows from the fact that $I_N(P_{XY|AB\psi_0}) = 2 - \sqrt{2} < 1$ for $N = 2$ (see Eq. (1)).
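The two facts used in this proof can be checked numerically. The following is a minimal sketch, not part of the original argument. It assumes the definition of $I_N$ from Section II C (which is not restated in this section), namely $I_N = P(X = Y|0, 2N-1) + \sum_{|a-b|=1} P(X \neq Y|a, b)$ with $a \in \{0, 2, \ldots, 2N-2\}$ and $b \in \{1, 3, \ldots, 2N-1\}$, and it assumes that the quantum predictions for $\psi_0$ are $P(X = Y|a, b) = \cos^2((a-b)\pi/4N)$. All function names are illustrative.

```python
import math
from itertools import product


def settings(n):
    """Measurement labels assumed from Section II C: even a, odd b."""
    return list(range(0, 2 * n, 2)), list(range(1, 2 * n, 2))


def i_n(p_equal, n):
    """Evaluate I_N, where p_equal(a, b) returns P(X = Y | a, b)."""
    a_set, b_set = settings(n)
    total = p_equal(0, 2 * n - 1)
    for a in a_set:
        for b in b_set:
            if abs(a - b) == 1:
                total += 1.0 - p_equal(a, b)
    return total


def quantum_p_equal(n):
    """Assumed quantum prediction for the maximally entangled state."""
    def p_equal(a, b):
        return math.cos((a - b) * math.pi / (4 * n)) ** 2
    return p_equal


def min_local_deterministic(n):
    """Minimum of I_N over all deterministic assignments X = f(a), Y = g(b)."""
    a_set, b_set = settings(n)
    best = float("inf")
    for f_vals in product((0, 1), repeat=n):
        for g_vals in product((0, 1), repeat=n):
            f = dict(zip(a_set, f_vals))
            g = dict(zip(b_set, g_vals))
            value = i_n(lambda a, b: 1.0 if f[a] == g[b] else 0.0, n)
            best = min(best, value)
    return best


quantum_value = i_n(quantum_p_equal(2), 2)   # close to 2 - sqrt(2)
deterministic_minimum = min_local_deterministic(2)
```

Under these assumptions every deterministic assignment gives $I_2 \geq 1$, while the quantum value lies strictly below 1, which is the gap the theorem exploits.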

C. Stochastic local causal theories 

In his later work, Bell dropped the assumption of deter- 
minism and considered more general stochastic models. 



12 The freedom of choice assumption is often not mentioned explicitly, but its necessity has been stressed by Bell in later work [10].



Bell's local causality criterion then corresponds to the requirement that $P_{XY|ABZ} = P_{X|AZ} P_{Y|BZ}$ [11]. Expanding the left hand side using Bayes' rule, this can be broken down into four separate relations, $P_{X|ABZ} = P_{X|AZ}$, $P_{Y|ABZ} = P_{Y|BZ}$, $P_{X|ABYZ} = P_{X|ABZ}$ and $P_{Y|ABXZ} = P_{Y|ABZ}$. The first two of these have sometimes been termed parameter independence and imply that, even given access to Z, there cannot be signalling between the two measurement processes.

The last two conditions have been termed outcome independence. They do not have an obvious operational significance (such as no-signalling). We note, however, that they are automatically satisfied in any deterministic model, where each of the outcomes X and Y is a function of A, B, and Z. Conversely, as we argue below, if a theory fulfills these conditions then the predictions it makes about the outcomes of measurements on the entangled state $|\psi_0\rangle = \frac{1}{\sqrt{2}}(|\uparrow\uparrow\rangle + |\downarrow\downarrow\rangle)$ are necessarily deterministic.

To see this, note that for any projective measurement (specified by $A = a$) applied to the first part of $|\psi_0\rangle$, there exists another projective measurement (specified by $B = b_a$) on the second part such that the outcomes are perfectly correlated. For example, if $A = a$ corresponds to the POVM $\{|\uparrow\rangle\langle\uparrow|, |\downarrow\rangle\langle\downarrow|\}$, and if we choose $B = b_a$ such that it corresponds to the same POVM, then $P_{XY|ab_a}(0, 0) = P_{XY|ab_a}(1, 1) = \frac{1}{2}$. This means that X is determined by Y, i.e., $P_{X|ab_a yz}(x) = \delta_{x,y} \in \{0, 1\}$ for all a, x, y and z. Applying now the conditions of local causality, we obtain $P_{X|ab_a yz}(x) = P_{X|az}(x) \in \{0, 1\}$, which corresponds to the assumption of local determinism. Hence, Theorem 1 remains valid if we weaken the local determinism condition to Bell's local causality condition.

We remark that, as we shall see below (Lemma 1), the freedom of choice assumption implies parameter independence, but is not strong enough to imply local causality, since it does not imply outcome independence.



D. Leggett-type theories 

In [12], Leggett introduced what he calls a "non-local 
hidden variable" model. Since the behaviour of the non- 
local variables is not specified in Leggett's model, we pre- 
fer to think of his model in terms of its additional local 
components (i.e. as a partially local model). We note that 
the model is not a full-fledged theory, as it only specifies 
how the outcomes of spin measurements are obtained. 

Leggett's model is based on the idea of assigning to 
each spin particle a three-dimensional vector (in addi- 
tion to its quantum mechanical state). In particular, if 
we consider two spin particles, each measured on one side 
within the bipartite setup described above, we need to 
specify two such vectors, denoted u and v, respectively. 
To connect this to our general discussion, we may think 
of these vectors as part of Z, i.e., Z takes as values pairs 
(u, v). As above, we denote the choice of measurement 



on each side by A and B. Restricting to projective spin 
measurements, the two choices may be labelled by three- 
dimensional vectors, denoted a and b, respectively, indi- 
cating their orientation in space (see, for example, [13] 
for more details). The predictions for the measurement 
outcomes X and Y, as prescribed by Leggett's model, 
are then given by 

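$P_{X|\mathbf{a}\mathbf{u}\mathbf{v}}(x) = \frac{1}{2}\left(1 + (-1)^x\, \mathbf{a} \cdot \mathbf{u}\right)$ (3)

$P_{Y|\mathbf{b}\mathbf{u}\mathbf{v}}(y) = \frac{1}{2}\left(1 + (-1)^y\, \mathbf{b} \cdot \mathbf{v}\right)$. (4)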

In order to completely define the model, one would also need to assign probabilities to all possible values Z = (u, v), i.e., specify a probability distribution $P_Z$ (which, in general, depends on the quantum state). However, the following theorem implies that there is no such assignment for which Leggett's model is compatible with quantum theory.

Theorem 2. fTMWj Let A, B, X, Y and Z be RVs. 
Then at least one of the following cannot hold: 

• Freedom of choice: A and B are free with respect to the causal order defined by Figure 5(b);

• Compatibility with quantum theory: $P_{XY|ABZ}$ is compatible with the predictions $P_{XY|AB\psi_0}$ of quantum theory for measurements on a maximally entangled state $|\psi_0\rangle$;

• Leggett rule: $P_{XY|ABZ}$ satisfies Eqs. (3) and (4) for all values A = a, B = b, and Z = (u, v).

We will not give a proof of this theorem here, since it follows from the more general results presented in the next section. To see this, it is sufficient to observe that, when measuring the entangled state $|\psi_0\rangle = \frac{1}{\sqrt{2}}(|\uparrow\uparrow\rangle + |\downarrow\downarrow\rangle)$, for instance, quantum theory prescribes that $P_{X|a}(x) = \frac{1}{2}$, independently of the orientation a of the measurement. In contrast, Leggett's model predicts a non-uniform distribution whenever the measurement orientation a is not orthogonal to the vector u. The Leggett model is therefore more informative than quantum theory, and hence excluded by Lemma 3 (as well as the more general Theorem 3 below).
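The statement that Leggett's rule is more informative than the uniform quantum marginal can be made concrete with a small sketch. It evaluates the rule of Eq. (3) for a hypothetical hidden vector u and a few hypothetical measurement orientations, and compares the result with the uniform marginal using the variational distance, taken here to be $D(P, Q) = \frac{1}{2}\sum_x |P(x) - Q(x)|$ (an assumption, since the definition is not restated in this section). All names are illustrative.

```python
import math


def leggett_probability(outcome, setting, hidden):
    """Leggett rule: (1 + (-1)^outcome * setting . hidden) / 2."""
    dot = sum(s * h for s, h in zip(setting, hidden))
    return 0.5 * (1 + (-1) ** outcome * dot)


def variational_distance(p, q):
    """Assumed definition: half the L1 distance between two distributions."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))


def deviation_from_quantum(setting, hidden):
    """Distance between Leggett's prediction and the uniform quantum marginal."""
    leggett = [leggett_probability(x, setting, hidden) for x in (0, 1)]
    return variational_distance(leggett, [0.5, 0.5])


# Hypothetical hidden vector and measurement orientations (unit vectors).
u = (0.0, 0.0, 1.0)
aligned = (0.0, 0.0, 1.0)
orthogonal = (1.0, 0.0, 0.0)
tilted = (math.sin(math.pi / 3), 0.0, math.cos(math.pi / 3))
```

Only the orientation orthogonal to u reproduces the uniform distribution; any other orientation yields a strictly positive deviation, which is what Lemma 3 rules out.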

E. Other Constraints 

Here we summarize a few other known constraints on theories compatible with quantum mechanics. One of the first results in this direction was that the quantum outcomes cannot be predetermined within a non-contextual model [3]. In such a model, one assumes the existence of a map from the set of projectors to the set {0, 1} such that for every set of projectors that constitute a POVM, exactly one member of that set is mapped to 1 (the element that maps to 1 is interpreted as the outcome that will occur if a measurement described by that POVM is carried out). Such a model is non-contextual in that whether or not a particular outcome occurs depends only on the individual projectors, and not on the set of projectors making up the POVM. The results of Kochen-Specker and of Bell [3] imply that no such assignment can exist if the Hilbert space dimension is at least 3.

Hardy [16] later showed that within any extended theory, an infinite number of underlying states are required, even to describe a single qubit, and Montina [17, 18] proved, under the assumption of Markovian dynamics, that the number of real parameters that an extended theory needs to characterize a state in Hilbert space dimension $N$ is at least $2N - 2$ (the same as the number of parameters needed to specify a pure quantum state up to global phase).

In addition, a claim in the same spirit as our non- 
extendibility theorem (presented in the next Section) has 
been obtained recently under the assumption of non- 
contextuality [19]. 

VI. THE NON-EXTENDIBILITY THEOREM 

This section is devoted to the key result of this arti- 
cle, asserting that quantum theory is maximally infor- 
mative. Stated informally, we make the following claim, 
first made in [5]. 

Claim 1. No alternative theory compatible with quantum 
theory and satisfying the freedom of choice assumption 
can give improved predictions. 

This claim generalizes the results of Bell and Leggett 
discussed in the previous section. The setup is broadly 
the same, but instead of the condition that the higher 
theory remains compatible with quantum theory for mea- 
surements on maximally entangled states, we require this 
for a wider class of states. Furthermore, rather than con- 
sidering theories that satisfy local determinism or the 
Leggett rule, the claim is about arbitrary theories that 
make improved predictions. 

The main technical theorem is as follows. 

Theorem 3. Let $|\phi\rangle_{SD}$ be a pure state and let $\{|y\rangle_D\}$ be a Schmidt basis on $D$. Then there exists a state $|\Gamma\rangle_{S'D'}$, and local POVMs $\{E^a_x\}$ and $\{F^b_y\}$ on $SS'$ and $DD'$, respectively, with $F^{b_0}_y = |y\rangle\langle y|_D \otimes \mathbb{1}_{D'}$ for some $b = b_0$, such that, for any RVs A, B, X, Y and Z, at least one of the following cannot hold:$^{13}$

• Freedom of choice: A and B are free with respect to the causal order depicted in Figure 5(b);$^{14}$

• Compatibility with quantum theory: $P_{XY|ABZ}$ is compatible with the prediction $P_{XY|AB(\phi\otimes\Gamma)}$ of quantum theory for the measurements $\{E^a_x\}$ and $\{F^b_y\}$ on $|\phi\rangle_{SD} \otimes |\Gamma\rangle_{S'D'}$;$^{15}$

• Improved predictions: $P_{Y|b_0\phi}$ is not as informative as $P_{Y|b_0 Z}$.

13 Strictly speaking, the entangled state and POVMs should be sequences of entangled states and POVMs, for which the maximum improvement in the prediction tends to 0 (cf. Lemma 3).

14 More generally, the statement holds for any causal order that satisfies the three conditions given in Section IV A.

To understand the implications of this theorem, consider a fixed measurement a on a system S. Assume that, according to quantum theory, the system (before the measurement) is in a pure state, denoted $\psi$, and that the measurement corresponds to a projective POVM, $\{E^a_x\}$. Quantum theory then gives a probabilistic prediction $P_{X|a\psi}$ for the measurement outcome X, which depends on $\psi$ and $\{E^a_x\}$ (see Eq. (2)). Our aim is to compare this quantum-mechanical prediction with the prediction $P_{X|aZ}$ that may be obtained by an alternative theory, whose parameters we denote by Z.

In order to relate this to the theorem, let us assume that the freedom of choice condition holds, from which it follows that the alternative theory is no-signalling. We then consider the joint state of the measured system, S, and the measurement device, D, after the measurement a. Following the discussion in Section III, according to quantum theory, this state can be assumed to have the form

$|\phi\rangle_{SD} = \sum_x E^a_x |\psi\rangle_S \otimes |x\rangle_D$. (5)

Note that the POVM $\{F^{b_0}_y\}$ defined by Theorem 3 corresponds to a measurement of D in the basis $\{|x\rangle_D\}$. The outcome, Y, of this measurement can therefore be seen as a copy of the outcome X of the original measurement, specified by $\{E^a_x\}$. In particular, the prediction that any theory compatible with quantum theory makes about Y must be identical to the prediction it makes about X, i.e., we have

$P_{X|a\psi} = P_{Y|b_0\phi}$

$P_{X|aZ} = P_{Y|b_0 Z}$.

(Note that because the free choice assumption implies 
that the alternative theory is no-signalling, the prediction 
the alternative theory makes about Y does not depend 
on a, for example.) 

We now apply Theorem 3 to $|\phi\rangle_{SD}$. If we assume that the alternative theory, in addition to being compatible with quantum theory, satisfies the freedom of choice assumption, then the third condition of the theorem cannot hold, i.e., $P_{Y|b_0\phi}$ is as informative as $P_{Y|b_0 Z}$. Using the above identities, this directly carries over to the original measurement a, i.e., the quantum-mechanical prediction $P_{X|a\psi}$ is as informative as the prediction $P_{X|aZ}$ of the alternative theory. We hence establish Claim 1.

15 Formally, $P_{XY|AB(\phi\otimes\Gamma)}$ is given by

$P_{XY|ab(\phi\otimes\Gamma)}(x, y) = \mathrm{tr}\big((E^a_x \otimes F^b_y)\, |\phi \otimes \Gamma\rangle\langle\phi \otimes \Gamma|\big)$.



VII. PROOF OF THEOREM 3

The theorem follows from three statements, which we formulate and prove separately. An overview of the argument is as follows. We consider the previously introduced bipartite scenario and the causal order depicted in Figure 5(b). We begin by showing that free choice with respect to this causal order implies that the alternative theory is no-signalling (see Lemma 1). In the second part of the argument, we show that for measurements on maximally entangled states, if quantum theory is correct, no higher theory can give improved predictions about the outcomes (see Lemma 3). In the final part of the argument, we generalize this to measurements on an arbitrary bipartite entangled state. More precisely, we show that for any such state, there exist local measurements that generate correlations arbitrarily close to those generated by r maximally entangled states for some sufficiently large integer r. Hence, from the second part of the argument, these measurements can have no improved predictions.



A. Part I: No-signalling from free choice 

In this part, we show that if A and B are free choices, then there is no signalling within the model (i.e., no signalling even given access to Z).

Lemma 1. The freedom of choice assumption implies $P_{XZ|AB} = P_{XZ|A}$ and $P_{YZ|AB} = P_{YZ|B}$.

Proof. That A is free within the specified causal order implies $P_{A|BYZ} = P_A$ and hence

$P_{YZA|B} = P_{YZ|B} \times P_{A|BYZ} = P_A \times P_{YZ|B}$

and

$P_{YZA|B} = P_{A|B} \times P_{YZ|AB} = P_A \times P_{YZ|AB}$.

We therefore have $P_{YZ|AB} = P_{YZ|B}$. The relation $P_{XZ|AB} = P_{XZ|A}$ follows by symmetry. □
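The content of Lemma 1 can be illustrated on a toy distribution. The sketch below is an illustration only: it builds a hypothetical joint distribution over binary (A, B, Y, Z) in which A is drawn independently of (B, Y, Z), and checks with exact rational arithmetic that $P_{YZ|AB} = P_{YZ|B}$ holds. A second hypothetical distribution, in which Y simply copies A, shows that signalling goes together with A failing to be free. The numerical values and helper names are made up for the example.

```python
from fractions import Fraction as F
from itertools import product

BITS = (0, 1)  # all variables are binary; outcomes are tuples (a, b, y, z)


def marginal(joint, keep):
    """Marginal distribution over the tuple positions listed in keep."""
    out = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in keep)
        out[key] = out.get(key, F(0)) + p
    return out


def conditional(joint, target, given):
    """P(target | given); keys are the target values followed by the given values."""
    num = marginal(joint, target + given)
    den = marginal(joint, given)
    return {k: p / den[k[len(target):]] for k, p in num.items()
            if den[k[len(target):]] > 0}


def a_is_free(joint):
    """Check P(a | b, y, z) = P(a) wherever the condition has positive probability."""
    p_a = marginal(joint, (0,))
    cond = conditional(joint, (0,), (1, 2, 3))
    return all(p == p_a[(k[0],)] for k, p in cond.items())


def y_side_no_signalling(joint):
    """Check P(y, z | a, b) = P(y, z | b), the conclusion of Lemma 1."""
    given_ab = conditional(joint, (2, 3), (0, 1))  # keys (y, z, a, b)
    given_b = conditional(joint, (2, 3), (1,))     # keys (y, z, b)
    return all(p == given_b[(y, z, b)] for (y, z, a, b), p in given_ab.items())


# Hypothetical example in which A is independent of (B, Y, Z).
P_A = {0: F(1, 3), 1: F(2, 3)}
P_B = {0: F(1, 4), 1: F(3, 4)}
P_YZ_GIVEN_B = {
    0: {(0, 0): F(1, 2), (0, 1): F(1, 6), (1, 0): F(1, 6), (1, 1): F(1, 6)},
    1: {(0, 0): F(1, 10), (0, 1): F(2, 5), (1, 0): F(3, 10), (1, 1): F(1, 5)},
}
free_joint = {(a, b, y, z): P_A[a] * P_B[b] * P_YZ_GIVEN_B[b][(y, z)]
              for a, b, y, z in product(BITS, repeat=4)}

# Counterexample: Y copies A, so the Y side signals the value of A.
signalling_joint = {(a, b, y, z): (F(1, 4) if (y == a and z == 0) else F(0))
                    for a, b, y, z in product(BITS, repeat=4)}
```

In the first distribution both checks succeed; in the second both fail, in line with the contrapositive of the lemma.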



B. Part II: Non-extendibility for Bell 
measurements 

In the second part of the argument, we show that the claim holds for particular measurements on maximally entangled pairs of qubits. The proof uses the correlation measure $I_N$ introduced in Section II C. The following lemma shows that this measure, applied to a distribution $P_{XY|AB}$, gives a bound on how well any additional information, Z, can be correlated to the outcome X. Note that the lemma is independent of quantum theory and is simply a property of probability distributions.

Lemma 2. Let $P_{XYZ|AB}$ be a distribution that obeys $P_{XZ|AB} = P_{XZ|A}$ and $P_{YZ|AB} = P_{YZ|B}$. Then, for all a and b, we have

$\langle D(P_{X|abz}, \bar{P}_X) \rangle_z \leq \frac{1}{2} I_N(P_{XY|AB})$, (6)

where $\langle \cdot \rangle_z$ denotes the average over the values of Z (distributed according to $P_{Z|ab}$), and $\bar{P}_X$ denotes the uniform distribution on X.

The proof is based on an argument given in [5] , which 
develops results of [50], [5T] and [T5] , 



Proof. We first consider the quantity $I_N$ evaluated for the conditional distribution $P_{XY|ABz} = P_{XY|ABZ}(\cdot, \cdot|\cdot, \cdot, z)$, for any fixed z. The idea is to use this quantity to bound the variational distance between the conditional distribution $P_{X|az}$ and its negation, $1 - P_{X|az}$, which corresponds to the distribution of X if its values are interchanged. If this distance is small, it follows that the distribution $P_{X|az}$ is roughly uniform. Because this holds for any Z = z, X must be independent of Z.

It is first worth noting that the conditions of the lemma ($P_{XZ|AB} = P_{XZ|A}$ and $P_{YZ|AB} = P_{YZ|B}$) imply $P_{X|ABZ} = P_{X|AZ}$ and $P_{Y|ABZ} = P_{Y|BZ}$ respectively, and together imply $P_{Z|AB} = P_Z$.

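Let $\bar{P}_X$ be the uniform distribution on X. For $a_0 := 0$, $b_0 := 2N - 1$, we have

$$\begin{aligned}
I_N(P_{XY|ABz}) &= P(X = Y|a_0, b_0, z) + \sum_{\substack{a \in A_N,\, b \in B_N \\ |a-b|=1}} P(X \neq Y|a, b, z) \\
&\geq D(1 - P_{X|a_0 b_0 z}, P_{Y|a_0 b_0 z}) + \sum_{\substack{a \in A_N,\, b \in B_N \\ |a-b|=1}} D(P_{X|abz}, P_{Y|abz}) \\
&= D(1 - P_{X|a_0 z}, P_{Y|b_0 z}) + \sum_{\substack{a \in A_N,\, b \in B_N \\ |a-b|=1}} D(P_{X|az}, P_{Y|bz}) \\
&\geq D(1 - P_{X|a_0 z}, P_{X|a_0 z}) \\
&= 2 D(P_{X|a_0 b_0 z}, \bar{P}_X). \qquad (7)
\end{aligned}$$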



The first inequality follows from the fact that $D(P_{X|\Omega}, P_{Y|\Omega}) \leq P(X \neq Y|\Omega)$ for any event $\Omega$ (see Lemma 6 in Appendix A). Furthermore, we have used the conditions $P_{X|abz} = P_{X|az}$ and $P_{Y|abz} = P_{Y|bz}$, and the triangle inequality for D. By symmetry, this relation holds for all a and b.

We now take the average over z on both sides of (7).



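The left-hand side gives

$$\begin{aligned}
\sum_z P_{Z|ab}(z)\, I_N(P_{XY|ABz}) &= \sum_z P_Z(z)\, I_N(P_{XY|ABz}) \\
&= \sum_z P_{Z|a_0 b_0}(z)\, P(X = Y|a_0, b_0, z) + \sum_{\substack{a \in A_N,\, b \in B_N \\ |a-b|=1}} \sum_z P_{Z|ab}(z)\, P(X \neq Y|a, b, z) \\
&= P(X = Y|a_0, b_0) + \sum_{\substack{a \in A_N,\, b \in B_N \\ |a-b|=1}} P(X \neq Y|a, b) \\
&= I_N(P_{XY|AB}), \qquad (8)
\end{aligned}$$

where we used the condition $P_{Z|ab} = P_Z$ several times. Furthermore, taking the average of the distance that appears on the right-hand side of (7) yields

$\sum_z P_{Z|ab}(z)\, D(P_{X|abz}, \bar{P}_X) = D(P_{XZ|ab}, \bar{P}_X \times P_{Z|ab})$,

which is equivalent to the left-hand side of (6). □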
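The chain of inequalities in (7) can be sanity-checked numerically for a fixed value z. The sketch below is only an illustration under stated assumptions: it restricts attention to product distributions $P(x, y|a, b) = P(x|a) P(y|b)$ with binary outcomes (a special case that satisfies $P_{X|abz} = P_{X|az}$ and $P_{Y|abz} = P_{Y|bz}$), uses the definition of $I_N$ assumed from Section II C, and takes $D(P, Q) = \frac{1}{2}\sum_x |P(x) - Q(x)|$. It then checks $I_N \geq 2 D(P_{X|a}, \bar{P}_X)$ for every setting a on randomly drawn examples.

```python
import random


def variational_distance(p, q):
    """Assumed definition: half the L1 distance between two distributions."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))


def i_n_product(p, q, n):
    """I_N for P(x, y | a, b) = P(x | a) P(y | b).

    p[a] and q[b] hold P(X = 0 | a) and P(Y = 0 | b) respectively.
    """
    def p_equal(a, b):
        return p[a] * q[b] + (1 - p[a]) * (1 - q[b])

    total = p_equal(0, 2 * n - 1)
    for a in range(0, 2 * n, 2):
        for b in (a - 1, a + 1):
            if 1 <= b <= 2 * n - 1:
                total += 1 - p_equal(a, b)
    return total


def worst_slack(n, trials=2000, seed=0):
    """Smallest observed value of I_N - 2 D(P_{X|a}, uniform) over random models."""
    rng = random.Random(seed)
    worst = float("inf")
    for _ in range(trials):
        p = {a: rng.random() for a in range(0, 2 * n, 2)}
        q = {b: rng.random() for b in range(1, 2 * n, 2)}
        i_val = i_n_product(p, q, n)
        for a in p:
            d = variational_distance((p[a], 1 - p[a]), (0.5, 0.5))
            worst = min(worst, i_val - 2 * d)
    return worst
```

A non-negative slack in every trial is what the lemma predicts; a small $I_N$ therefore forces each $P_{X|az}$ to be close to uniform.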

We now apply Lemma 2 to the quantum correlations $P_{XY|ab\psi_0}$ arising from measurements on the maximally entangled state $\psi_0$ (cf. Section II C). In the limit where N tends to infinity, we have $\lim_{N\to\infty} I_N(P_{XY|AB}) = 0$, and hence we can establish that $P_{X|abz} = P_{X|ab\psi_0}$ for all a, b and z with $P_{ABZ|\psi_0}(a, b, z) > 0$. Under the freedom of choice assumption and assuming compatibility with quantum theory (note that $P_{X|ab\psi_0}(x) = \bar{P}_X(x) = \frac{1}{2}$ for both x = 0 and x = 1) this implies $P_{X|az} = P_{X|a\psi_0}$ for all a and z with $P_{Z|a}(z) > 0$. This means that Z gives no additional information about the measurement outcome, X.

Taking Parts I and II together, we obtain the following 
lemma, which may be of independent interest. 

Lemma 3. For any $\delta > 0$ there exists an $N \in \mathbb{N}$ such that for any RVs A, B, X, Y and Z, at least one of the following three conditions cannot hold:

• Freedom of choice: A and B are free with respect to the causal order depicted in Figure 5(b);

• Compatibility with quantum theory: $P_{XY|ABZ}$ is compatible with $P_{XY|AB\psi_0}$;

• Improved predictions: There exists a value A = a such that $\langle D(P_{X|az}, P_{X|a\psi_0}) \rangle_z > \delta$, where $\langle \cdot \rangle_z$ denotes the expectation value over z.

Hence, if an alternative theory is compatible with quantum theory and satisfies the freedom of choice assumption then the third condition cannot hold, i.e., $\langle D(P_{X|az}, P_{X|a\psi_0}) \rangle_z \leq \delta$. Since $\delta$ can be arbitrarily small, this implies that quantum theory is as informative as the alternative theory.
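To give a feeling for how N depends on $\delta$, the following sketch combines the bound of Lemma 2, $\langle D \rangle_z \leq \frac{1}{2} I_N$, with the quantum value $I_N = 2N \sin^2(\pi/4N)$. That closed form is an assumption taken from Eq. (1) of Section II C (it reproduces $2 - \sqrt{2}$ for N = 2) and is not derived here; the function names are illustrative.

```python
import math


def quantum_i_n(n):
    """Assumed quantum value of I_N for the maximally entangled state."""
    return 2 * n * math.sin(math.pi / (4 * n)) ** 2


def prediction_bound(n):
    """Upper bound on the average improvement <D>_z implied by Lemma 2."""
    return 0.5 * quantum_i_n(n)


def smallest_n(delta):
    """Smallest N for which the bound drops to delta or below."""
    if delta <= 0:
        raise ValueError("delta must be positive")
    n = 1
    while prediction_bound(n) > delta:
        n += 1
    return n
```

Since the bound decays roughly like $\pi^2/(16N)$, the required number of measurement settings grows inversely with $\delta$.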
C. Part III: Generalization to arbitrary
measurements

The last part of the proof of Theorem 3 consists of generalizing Lemma 3, which applies to specific measurements on a maximally entangled state, to measurements on the general state $|\phi\rangle_{SD}$. The proof relies on the concept of embezzling states [52]. These are entangled states that can be used to extract any desired maximally entangled state locally and without communication. More precisely, we will use the following lemma, which is implicit in [52].


Lemma 4. For any $\delta > 0$ and for any $k \in \mathbb{N}$ there exists a bipartite state $|\Gamma_k\rangle_{S'D'}$, the embezzling state, such that for any $m \leq k$, there exist local isometries, $U_m$ and $V_m$, on $S'$ and $D'$, respectively, that perform the transformation


l rfe ) 



SD 



with fidelity at least $1 - \delta$, where $\psi_0$ denotes a maximally entangled state of two qubits.

Note that the state $|\phi\rangle_{SD}$ considered in Theorem 3 can be represented by its Schmidt decomposition as

$|\phi\rangle_{SD} = \sum_y \sqrt{p_y}\, |y\rangle_S \otimes |y\rangle_D$.

We now consider an embezzling state on $S'D'$ and use Lemma 4 to define isometries U and V on $SS'$ and $DD'$, respectively, which are controlled by the entry y in the registers S or D, and build up m(y) bits of entanglement between registers $S'$ and $D'$, i.e.,

$U := \sum_y |y\rangle\langle y|_S \otimes U_{m(y)}$

$V := \sum_y |y\rangle\langle y|_D \otimes V_{m(y)}$.

The integers m(y) are chosen such that the state resulting from applying $U \otimes V$ to $|\phi\rangle_{SD} \otimes |\Gamma_k\rangle_{S'D'}$ is close to a state of the form

$\Big(2^{-r/2} \sum_y \sum_{y'=1}^{m(y)} |y, y'\rangle_{SS'} \otimes |y, y'\rangle_{DD'}\Big) \otimes |\Gamma_k\rangle_{S'D'}$,

with $\sum_y m(y) = 2^r$, for some integer r. (This can be achieved to arbitrary precision for sufficiently large k and m(y).) Note that the first part of this state corresponds to r maximally entangled pairs, $\psi_0^{\otimes r}$, between the registers $SS'$ and $DD'$. We now construct the POVMs $\{E^a_x\}$ and $\{F^b_y\}$ by concatenating the operations U and V with the projective measurements along the vectors $|(\frac{a_i}{2N} + x_i)\pi\rangle$ and $|(\frac{b_i}{2N} + y_i)\pi\rangle$ introduced in Section II C. More precisely, we define

$E^a_x := U^\dagger \Big[\Big(\bigotimes_{i=1}^{r} |(\tfrac{a_i}{2N} + x_i)\pi\rangle\langle(\tfrac{a_i}{2N} + x_i)\pi|\Big) \otimes \mathbb{1}\Big] U$

$F^b_y := V^\dagger \Big[\Big(\bigotimes_{i=1}^{r} |(\tfrac{b_i}{2N} + y_i)\pi\rangle\langle(\tfrac{b_i}{2N} + y_i)\pi|\Big) \otimes \mathbb{1}\Big] V$

with $a = (a_1, \ldots, a_r) \in A_N^r$ and $b = (b_1, \ldots, b_r) \in B_N^r$ for some large N. In addition we define

$F^{b_0}_y := |y\rangle\langle y|_D \otimes \mathbb{1}_{D'}$.

Assume now that the freedom of choice as well as the compatibility with quantum theory assumption are satisfied. Furthermore, let $X = (\tilde{X}, X')$ and $Y$ be the outcomes of the measurements $A = a_0 := (0, \ldots, 0)$ and $B = b_0$, respectively. By choosing the orientation of the vectors $|\uparrow\rangle$ and $|\downarrow\rangle$ of Section II C appropriately, we can arrange it so that quantum theory predicts that the outcomes of the measurements of $a_0$ and $b_0$ are in agreement, in the sense that $\tilde{X} = Y$ holds with probability 1. Hence, together with the no-signalling conditions (cf. Lemma 1) we find that

$P_{Y|b_0(\phi\otimes\Gamma)} = P_{\tilde{X}|a_0(\phi\otimes\Gamma)}$

$P_{Y|b_0 Z} = P_{\tilde{X}|a_0 Z}$.

Lemma 3 implies that $P_{X|a_0(\phi\otimes\Gamma)}$ must be as informative as $P_{X|a_0 Z}$. In particular, the same relation holds for the marginals of these distributions, i.e., $P_{\tilde{X}|a_0(\phi\otimes\Gamma)}$ is as informative as $P_{\tilde{X}|a_0 Z}$. Combining this with the above identities we find that $P_{Y|b_0\phi} = P_{Y|b_0(\phi\otimes\Gamma)}$ is as informative as $P_{Y|b_0 Z}$, thus concluding the proof of Theorem 3.


VIII. ALTERNATIVE THEORIES ARE 
EQUIVALENT TO QUANTUM THEORY 

In this section, we discuss an implication of the non-extendibility theorem (Theorem 3) to a long-standing debate on the nature of the quantum mechanical wave function. The debate centres around whether it should be interpreted as a subjective quantity, for example a state of knowledge about some underlying physical reality, or whether it should instead be interpreted as objective (real).$^{16}$ The wave function could be considered subjective if there existed an alternative theory, with predictions based on a parameter Z, that is at least as informative as quantum theory, and in which two different wave functions, say $\psi$ and $\psi'$, are compatible with the same value of the parameter, say Z = z. Formally, this would mean that there exist z, $\psi$, and $\psi' \neq \psi$ such that $P_{Z\Psi}(z, \psi) > 0$ and $P_{Z\Psi}(z, \psi') > 0$. This is sometimes called a $\psi$-epistemic view of the wave function and contrasts with the $\psi$-ontic, or objective, view [21] (we refer to [23l EH [26] for arguments in favour of the $\psi$-epistemic view). In the latter, the wave function is uniquely determined by the parameters of any alternative theory that is at least as informative as quantum theory, i.e., there exists a (deterministic) function, f, such that $\Psi = f(Z)$.

16 Note that in some subjective interpretations (e.g. [21]) there is no underlying physical reality: the wave function is simply a state of knowledge about future measurement outcomes and nothing more.

Our result is based on the following simple lemma, 
which asserts that, if an alternative theory is equally 
informative as quantum theory then the wave function 
is indeed uniquely determined by the parameter of the 
alternative theory. 

Lemma 5. Suppose $\{E^a_x\}$ form a tomographically complete set of POVMs, and A, X, $\Psi$ and Z are RVs such that:

• A is a free choice with respect to a causal order in which $A \not\rightsquigarrow Z$ and $A \not\rightsquigarrow \Psi$.

• $P_{X|AZ}$ is at least as informative as $P_{X|A\Psi}$, where $P_{X|a\psi}(x) = \mathrm{tr}(E^a_x |\psi\rangle\langle\psi|)$.

• $P_{X|A\Psi}$ is at least as informative as $P_{X|AZ}$.

Then there exists a function, f, such that $\Psi = f(Z)$.

Proof. If $P_{X|AZ}$ is at least as informative as $P_{X|A\Psi}$, then there exists a distribution $\bar{P}_{XZ\Psi|A}$ such that

$P_{X|a\psi} = \sum_z \bar{P}_{XZ|a\psi}(\cdot, z) \quad \forall\, a, \psi$

$P_{X|az} = \sum_\psi \bar{P}_{X\Psi|az}(\cdot, \psi) \quad \forall\, a, z$

(we drop the bar on P in the following, and simply use $P_{XZ\Psi|A}$ to denote this distribution). We have

$P_{X|az\psi} = P_{X|az}$

for all a, z, $\psi$ that have a non-zero joint probability, i.e., $P_{AZ\Psi}(a, z, \psi) > 0$. Likewise, if $P_{X|A\Psi}$ is at least as informative as $P_{X|AZ}$ then

$P_{X|az\psi} = P_{X|a\psi}$

holds under the same condition. Combining these expressions gives

$P_{X|a\psi} = P_{X|az}$. (9)

If A is a free choice, we have $P_{AZ\Psi} = P_A \times P_{Z\Psi}$, hence (9) holds provided that $P_{Z\Psi}(z, \psi) > 0$ and $P_A(a) > 0$.

Let now z, $\psi$ and $\psi'$ be such that $P_{Z\Psi}(z, \psi) > 0$ and $P_{Z\Psi}(z, \psi') > 0$. From (9), this implies $P_{X|a\psi} = P_{X|a\psi'}$ for all a such that $P_A(a) > 0$. Since the set of measurements with $P_A(a) > 0$ is tomographically complete, this can only be satisfied if $\psi = \psi'$. It hence follows that there exists a function f such that $\Psi = f(Z)$. □
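The role of tomographic completeness in the last step can be illustrated for a single qubit. The sketch below is an example under stated assumptions, not part of the proof: it takes spin measurements along the x, y and z axes as the tomographically complete set, describes a pure state by its Bloch vector r, and uses the standard Born-rule expression $P(0|\mathbf{n}, \mathbf{r}) = \frac{1}{2}(1 + \mathbf{n} \cdot \mathbf{r})$. Because the Bloch vector can be reconstructed from the predictions, two states with identical predictions for all three measurements must coincide. All names are illustrative.

```python
import math

# Assumed tomographically complete set: spin measurements along x, y and z.
AXES = {"x": (1.0, 0.0, 0.0), "y": (0.0, 1.0, 0.0), "z": (0.0, 0.0, 1.0)}


def bloch_vector(theta, phi):
    """Bloch vector of the pure qubit state with polar angles (theta, phi)."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))


def born_probability(outcome, axis, bloch):
    """P(outcome | axis, state) = (1 + (-1)^outcome * axis . bloch) / 2."""
    dot = sum(a * r for a, r in zip(axis, bloch))
    return 0.5 * (1 + (-1) ** outcome * dot)


def predictions(bloch):
    """Probability of outcome 0 for each measurement in the set."""
    return {name: born_probability(0, axis, bloch) for name, axis in AXES.items()}


def reconstruct(pred):
    """Invert the predictions to recover the Bloch vector."""
    return tuple(2 * pred[name] - 1 for name in ("x", "y", "z"))


state = bloch_vector(0.7, 1.9)
other = bloch_vector(0.7, 2.0)
recovered = reconstruct(predictions(state))
```

Here `recovered` agrees with `state` up to rounding, while `state` and `other` differ in at least one prediction, which is the property the proof uses to conclude $\psi = \psi'$.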



Combining Theorem 3 with Lemma 5, we can establish the main result of this section, which we state informally as follows.

Claim 2. In any alternative theory that is at least 
as informative as quantum theory and satisfies the free 
choice assumption, there is a one-to-one correspondence 
between the parameters of the alternative theory and the 
quantum state (up to a possible removable degeneracy 17 
in the parameters of the alternative theory). 

To establish this, as before, we use Z to denote the parameters of the higher theory. Theorem 3 shows that under the free choice assumption, quantum theory is at least as informative as any alternative theory. The conditions of Lemma 5 are hence satisfied, so we find $\Psi = f(Z)$, for some function f. Furthermore, since Z cannot improve the predictions for any $\Psi = \psi$, any z in $f^{-1}(\psi)$ must give identical predictions. Hence, if $f^{-1}(\psi)$ contains more than one element, this corresponds to a removable degeneracy in the parameters of the alternative theory.



Related work 

An interpretation of the wave function as a subjective state of knowledge about some underlying theory has also been ruled out by Pusey et al. [27] via a different argument using different assumptions, which we now summarize. They consider the preparation of multiple quantum systems, with states $\Psi_i$, where each system is associated with a particular parameter in the higher theory, $Z_i$. Pusey et al. assume that the joint distribution of these is product, i.e.

$P_{Z_1 Z_2 \cdots \Psi_1 \Psi_2 \cdots} = P_{Z_1 \Psi_1} \times P_{Z_2 \Psi_2} \times \cdots$. (10)

Starting from this assumption, they show that there cannot exist two distinct states, $\psi$ and $\psi'$, such that for each i there exists a value of $Z_i = z_i$ satisfying $P_{Z_i \Psi_i}(z_i, \psi) > 0$ and $P_{Z_i \Psi_i}(z_i, \psi') > 0$.

We note that the product nature of the joint distribution, Eq. (10), is related to free choice of preparation. In particular, it implies

$P_{\Psi_1 Z_2 \cdots Z_N \Psi_2 \cdots \Psi_N} = P_{\Psi_1} \times P_{Z_2 \cdots Z_N \Psi_2 \cdots \Psi_N}$,

and analogously for every other index i. If we take the causal order to be such that $\Psi_i \not\rightsquigarrow \Psi_j$ and $\Psi_i \not\rightsquigarrow Z_j$ for $j \neq i$ (as would be natural if we make spacelike separated preparations), then this is equivalent to saying that $\Psi_i$ can be chosen freely.



17 Any degeneracy is removable in the sense that it has no oper- 
ational effect, i.e., one can define another theory without the 
degeneracy (but otherwise identical) without affecting the pre- 
dictive power. 
It was subsequently noted [55] that the separability assumption can be weakened, in essence to the assumption that there exists a particular set of parameters in the higher theory that are compatible with every product state composed of $\psi$ and $\psi'$, i.e., there exist values of the parameters, $z_1, \ldots, z_N$, such that

$P_{Z_1 \cdots Z_N \Psi_1 \cdots \Psi_N}(z_1, \ldots, z_N, \psi^{(1)}, \ldots, \psi^{(N)}) > 0$,

where each $\psi^{(i)}$ is independently either $\psi$ or $\psi'$ (so that the above represents $2^N$ conditions). This condition can be further weakened [29] such that the parameters of the alternative theory for multiple systems need not be made up only of the individual parts, but could be replaced or supplemented with global parameters (provided these are also compatible with all the product state preparations).

An alternative argument against an interpretation of 
the quantum state as a state of knowledge about an un- 
derlying reality can be found in [30] . 

IX. DISCUSSION 

The main statements described in this article about 
the completeness of quantum theory are based on two 
assumptions. One of them is that quantum theory is 
correct, and is implicit in the question of completeness. 
The other is that of free choice within a natural causal 
structure. It is worth commenting on the existence of 
alternative models that are not compatible with this as- 
sumption. 

A prominent example is the de Broglie-Bohm model [31, 32], which recreates quantum correlations, providing a higher explanation in the form of hidden particle positions. These can be thought of as parameters of a higher theory that would allow perfect predictions of the outcomes. However, introducing these parameters comes at a price: it is incompatible with the freedom of choice assumption of our theorems. In fact, for the bipartite setting discussed above, if Z includes the particle positions of the de Broglie-Bohm model, we have some non-local behaviour, so that $P_{X|abz} = P_{X|az}$, for instance, does not hold. Thus, given Lemma 1, it follows that A and B cannot be free choices with respect to the causal order of Figure 5(b).

There are at least two ways to avoid our conclusions. The first is to maintain free choice, but assume that the alternative theory has a different causal structure (in particular, one in which either $A \not\rightsquigarrow Y$ or $B \not\rightsquigarrow X$ does not hold). The second is to give up the freedom of choice within the alternative theory, so that the measurement choices A and B may depend on the additional parameters Z (sometimes, this view is argued for by imagining that the additional parameters are permanently hidden).

One may take the view that the freedom of choice 
assumption, which demands complete independence be- 
tween the chosen settings and the other variables, is rel- 
atively strong, and perhaps contemplate alternative the- 
ories where this assumption is weakened. Some results 
in this direction can be found in [33], where a theorem 
similar to Lemma 3 is established under a relaxed free
choice assumption, and provided there is no signalling at 
the level of the underlying theory. 

Finally, we note that the result presented here has a generic application in quantum cryptography. Standard security proofs for schemes such as quantum key distribution [34, 35] are based on the assumption (usually not stated explicitly) that quantum theory is complete. If this were not the case, it could be that a scheme is proven secure within quantum theory, yet an adversary can break it by exploiting information available in a higher theory. However, the non-extendibility theorem, Theorem 3, implies that it is sufficient to make only the weaker assumption that quantum theory is correct, since this implies completeness.
Appendix A: Variational Distance

The following is a list of the main properties of the variational distance $D(\cdot, \cdot)$ used in this work:

• $D(\cdot, \cdot)$ is a metric on the space of probability distributions.

• $D(\cdot, \cdot)$ is upper bounded by 1.

• The variational distance of marginal distributions cannot be larger than that of the joint distributions: $D(P_X, Q_X) \leq D(P_{XY}, Q_{XY})$ for any $P_{XY}$ and $Q_{XY}$.

• It is convex: If $\{\alpha_i\}$ satisfy $\alpha_i \geq 0$ and $\sum_i \alpha_i = 1$, and $\{P^i_X\}$ and $\{Q^i_X\}$ are sets of distributions over X, then $D(\sum_i \alpha_i P^i_X, \sum_i \alpha_i Q^i_X) \leq \sum_i \alpha_i D(P^i_X, Q^i_X)$.

• For a joint distribution $P_{XY}$, the variational distance between the marginal distributions is bounded by the probability that the RVs X and Y have different values: $D(P_X, P_Y) \leq P(X \neq Y)$.

The first four properties follow straightforwardly from 
the definition. The last is proved in the following. 

Lemma 6. Let X and Y be two random variables jointly distributed according to $P_{XY}$. Then the variational distance between the marginal distributions $P_X$ and $P_Y$ is bounded by

$D(P_X, P_Y) \leq P(X \neq Y)$.

Proof. Let $P^{\neq}_{XY} := P_{XY|X \neq Y}$ be the joint distribution of X and Y conditioned on the event that they are not equal. Similarly, define $P^{=}_{XY} := P_{XY|X = Y}$. We then have

$P_{XY} = p_{\neq} P^{\neq}_{XY} + (1 - p_{\neq}) P^{=}_{XY}$

where $p_{\neq} := P(X \neq Y)$. By linearity, the marginals of these distributions satisfy the same relation, i.e.,

$P_X = p_{\neq} P^{\neq}_X + (1 - p_{\neq}) P^{=}_X$

$P_Y = p_{\neq} P^{\neq}_Y + (1 - p_{\neq}) P^{=}_Y$.

Hence, by convexity of the variational distance,

$D(P_X, P_Y) \leq p_{\neq} D(P^{\neq}_X, P^{\neq}_Y) + (1 - p_{\neq}) D(P^{=}_X, P^{=}_Y) \leq p_{\neq}$,

where the last inequality follows because the variational distance is at most 1, and $D(P^{=}_X, P^{=}_Y) = 0$. □
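Lemma 6 is easy to test numerically. The following sketch is an illustration only: it assumes the usual definition $D(P, Q) = \frac{1}{2}\sum_x |P(x) - Q(x)|$, draws random joint distributions over a small alphabet, and compares the distance between the marginals with the probability that X and Y differ. The helper names are illustrative.

```python
import random


def variational_distance(p, q):
    """Assumed definition: half the L1 distance; p and q map values to probabilities."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def random_joint(size, rng):
    """Random joint distribution of (X, Y) on {0, ..., size-1}^2."""
    weights = [[rng.random() for _ in range(size)] for _ in range(size)]
    total = sum(map(sum, weights))
    return {(x, y): weights[x][y] / total
            for x in range(size) for y in range(size)}


def marginals(joint):
    """Marginal distributions of X and Y."""
    p_x, p_y = {}, {}
    for (x, y), p in joint.items():
        p_x[x] = p_x.get(x, 0.0) + p
        p_y[y] = p_y.get(y, 0.0) + p
    return p_x, p_y


def prob_unequal(joint):
    """P(X != Y)."""
    return sum(p for (x, y), p in joint.items() if x != y)


def lemma6_slack(joint):
    """P(X != Y) - D(P_X, P_Y); Lemma 6 states that this is non-negative."""
    p_x, p_y = marginals(joint)
    return prob_unequal(joint) - variational_distance(p_x, p_y)
```

The bound is tight in simple cases: for perfectly correlated variables both sides vanish, and for a deterministic pair with X different from Y both sides equal 1.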



